AITopics | penultimate layer

penultimate layer

0525fa17a8dbea687359116d01732e12-Supplemental-Conference.pdf

Neural Information Processing Systems, Apr-24-2026, 09:16:17 GMT

artificial intelligence, fourier transform, regularisation, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.31)

A Holistic Approach to Unifying Automatic Concept Extraction and Concept Importance Estimation

Neural Information Processing Systems, Feb-16-2026, 11:45:08 GMT

In recent years, concept-based approaches have emerged as some of the most promising explainability methods to help us interpret the decisions of Artificial Neural Networks (ANNs).

artificial intelligence, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country:

Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
North America > United States > Maryland (0.04)
Europe > Italy > Marche > Ancona Province > Ancona (0.04)

Genre: Research Report > Promising Solution (0.46)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

3effb91593c4fb42b1da1528328eff49-Paper-Conference.pdf

Neural Information Processing Systems, Feb-11-2026, 17:47:51 GMT

artificial intelligence, machine learning, semanticscholar, (19 more...)

Neural Information Processing Systems

Country:

North America > United States (0.28)
North America > Canada > Ontario > Toronto (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Industry: Information Technology (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

804dbf8d3b8eee1ef875c6857efc64eb-Supplemental-Conference.pdf

Neural Information Processing Systems, Feb-10-2026, 07:03:16 GMT

dataset, equation, estimation, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

RankFeat: Rank-1 Feature Removal for Out-of-distribution Detection - Supplementary Material - A Experimental Setup

Neural Information Processing Systems, Feb-9-2026, 18:27:48 GMT

The source code is implemented with PyTorch 1.10.1, and ... We select four subsets as the OOD benchmark, namely Protozoa, Microorganisms, Plants, and Mollusks. Table 2 compares the performance against all the post hoc baselines. One of the earliest works considered directly using the Maximum Softmax Probability (MSP) as the scoring function for OOD detection. In [19], the authors observed that the activations of the penultimate layer are quite different for ID and OOD data.
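The MSP scoring function mentioned in the excerpt above can be illustrated with a minimal, standard-library sketch. It is not taken from the RankFeat code: the helper names `softmax`, `msp_score`, and `is_ood`, the toy logits, and the threshold of 0.5 are all assumptions chosen for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def msp_score(logits):
    """Maximum Softmax Probability: a higher score suggests in-distribution data."""
    return max(softmax(logits))

def is_ood(logits, threshold=0.5):
    """Flag an input as OOD when its MSP falls below a hypothetical threshold."""
    return msp_score(logits) < threshold

# Toy logits (assumed values): one confident prediction, one near-uniform.
confident = [8.0, 0.5, -1.0]
uncertain = [0.1, 0.0, -0.1]
print(msp_score(confident), is_ood(confident))
print(msp_score(uncertain), is_ood(uncertain))
```

In practice the threshold would be chosen on held-out data, for example to fix the true positive rate on in-distribution samples, rather than hard-coded.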

artificial intelligence, machine learning, rankfeat, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.37)

Despite their empirical success, neural networks remain vulnerable to small, adversarial perturbations. A longstanding hypothesis suggests that flat minima, regions of low curvature in the loss landscape, offer increased robustness. While intuitive, this connection has remained largely informal and incomplete. By rigorously formalizing the relationship, we show this intuition is only partially correct: flatness implies local but not global adversarial robustness. To arrive at this result, we first derive a closed-form expression for relative flatness in the penultimate layer, and then show we can use this to constrain the variation of the loss in input space. This allows us to formally analyze the adversarial robustness of the entire network. We then show that to maintain robustness beyond a local neighborhood, the loss needs to curve sharply away from the data manifold. We validate our theoretical predictions empirically across architectures and datasets, uncovering the geometric structure that governs adversarial vulnerability, and linking flatness to model confidence: adversarial examples often lie in large, flat regions where the model is confidently wrong. Our results challenge simplified views of flatness and provide a nuanced understanding of its role in robustness.

Despite their success across a wide range of tasks, neural networks remain notoriously brittle under adversarial perturbations. Small, often imperceptible changes to the input can dramatically alter a model's prediction. Understanding the structural properties that contribute to this vulnerability is central to building more robust systems. One property that has long attracted attention is the flatness of the loss surface. Earlier work suggested that flatter minima correlate with better generalization (Hochreiter & Schmidhuber, 1994; Jiang et al., 2019); however, the universality of this link remains an open question (Andriushchenko et al., 2023). Flatness also emerged as a potential indicator for adversarial robustness (Wu et al., 2020): a model whose loss landscape is locally flat in parameter space might resist small perturbations in input space. At first glance, the two appear disconnected, since adversarial examples concern the change of the loss with respect to the input, while flatness quantifies the change with respect to the weights.
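One elementary way to see that input-space and weight-space perturbations need not be unrelated is the linear case: for a layer computing w·x, scaling the input by (1 + eps) yields the same output, and hence the same loss, as scaling the weights by (1 + eps). The sketch below checks this numerically. It is an illustration under assumed toy values, not the paper's derivation; `dot`, `squared_loss`, and all numbers are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))

def squared_loss(w, x, y):
    """Squared error of a linear model w.x against target y."""
    return (dot(w, x) - y) ** 2

# Toy weights, input, target, and perturbation size (all assumed values).
w = [0.5, -1.0, 2.0]
x = [1.0, 2.0, 0.5]
y = 0.25
eps = 0.1

# Multiplicative perturbation applied to the input versus to the weights.
x_pert = [(1 + eps) * xi for xi in x]
w_pert = [(1 + eps) * wi for wi in w]

loss_input = squared_loss(w, x_pert, y)   # perturb x, keep w
loss_weight = squared_loss(w_pert, x, y)  # perturb w, keep x
print(loss_input, loss_weight)            # equal up to floating-point rounding
```

The identity only covers perturbations of this multiplicative form on a single linear layer; general additive input perturbations in a deep network require the kind of analysis the abstract describes.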

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2510.14231

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)

Multimodal Language Models See Better When They Look Shallower

Chen, Haoran, Lin, Junyan, Chen, Xinghao, Fan, Yue, Dong, Jianfeng, Jin, Xin, Su, Hui, Fu, Jinlan, Shen, Xiaoyu

arXiv.org Artificial Intelligence, Oct-13-2025

Multimodal large language models (MLLMs) typically extract visual features from the final layers of a pretrained Vision Transformer (ViT). This widespread deep-layer bias, however, is largely driven by empirical convention rather than principled analysis. While prior studies suggest that different ViT layers capture different types of information, with shallower layers focusing on fine visual details and deeper layers aligning more closely with textual semantics, the impact of this variation on MLLM performance remains underexplored. We present the first comprehensive study of visual layer selection for MLLMs, analyzing representation similarity across ViT layers to establish shallow, middle, and deep layer groupings. Through extensive evaluation of MLLMs (1.4B-7B parameters) across 10 benchmarks encompassing 60+ tasks, we find that while deep layers excel in semantic-rich tasks like OCR, shallow and middle layers significantly outperform them on fine-grained visual tasks including counting, positioning, and object localization. Building on these insights, we propose a lightweight feature fusion method that strategically incorporates shallower layers, achieving consistent improvements over both single-layer and specialized fusion baselines. Our work offers the first principled study of visual layer selection in MLLMs, showing that MLLMs can often see better when they look shallower.
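The idea of drawing visual features from several ViT depths instead of only the final layers can be sketched as a simple channel-wise concatenation. This is not the fusion method proposed in the paper, whose details are not given in the abstract; `fuse_layers`, the chosen layer indices, and the toy tensor sizes are assumptions for illustration.

```python
def fuse_layers(hidden_states, layer_ids):
    """Concatenate per-token features from the chosen layers along the channel axis.

    hidden_states: list over layers; each entry is a list of tokens,
                   and each token is a list of floats.
    layer_ids:     indices of the layers to fuse (a hypothetical choice).
    """
    num_tokens = len(hidden_states[0])
    fused = []
    for t in range(num_tokens):
        token = []
        for i in layer_ids:
            token.extend(hidden_states[i][t])
        fused.append(token)
    return fused

# Toy "ViT" with 4 layers, 2 tokens, 3 channels; each value encodes its layer index.
hidden_states = [[[float(layer)] * 3 for _ in range(2)] for layer in range(4)]

# Pick one shallow, one middle, and one deep layer (assumed grouping).
fused = fuse_layers(hidden_states, layer_ids=[0, 2, 3])
print(len(fused), len(fused[0]))  # 2 tokens, 9 channels each
```

Concatenation keeps the token count unchanged and grows only the channel dimension, so a projector into the language model would need a correspondingly wider input.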

artificial intelligence, large language model, natural language, (19 more...)

arXiv.org Artificial Intelligence

2504.21447

Country: Asia (0.46)

Genre: Research Report > New Finding (1.00)

Technology: